ABSTRACT



The invention greatly reduces common bus contention
by allowing the semaphore test bit and set operations to
be performed on each CPU's local bus. The semaphore
lock bits are stored locally in high speed SRAM on each
CPU, and coherency of the lock bits is maintained
through a bus monitoring logic circuit on each CPU. A
CPU wishing to take possession of a semaphore performs
a local read of its semaphore memory and spins
locally until the lock bit is reset, at which time it performs
a local write to set the bit. When the semaphore
lock bit is written, it will be updated locally, and at the
same time the write operation will be sent out over the
common bus. The bus monitoring logic on every other
CPU will recognize the write operation and simultaneously
update the corresponding lock bit in each local
semaphore memory. This ability to read spin locally
relieves the common bus from the great amount of
traffic that occurs in typical systems that maintain the
semaphore lock bits in common global memory.

3 Claims, 1 Drawing Sheet

SEMAPHORE MEMORY TO REDUCE COMMON BUS CONTENTION TO
GLOBAL MEMORY WITH LOCALIZED SEMAPHORES IN A
MULTIPROCESSOR SYSTEM

This invention generally relates to an apparatus used
to support semaphore operations in a common bus
multiprocessor computer system.

BACKGROUND OF THE INVENTION

In a computer system which allows several processes
to co-exist (e.g. a multi-tasking operating system or
multi-processor system) a means to synchronize the
processes is needed. The semaphore mechanism is a
common approach to solving this need. Although the
term semaphore has been used to describe several
synchronization mechanisms we will use it only to mean
the algorithms devised by E. W. Dijkstra.

The essence of Dijkstra's semaphores is a data structure
and algorithm which control the use of resources.
In computers semaphores are used to protect data
structures in the computer's memory, to arbitrate the use of
devices and to synchronize processes with external
events.

Previous computer systems have implemented semaphores
using a software or microcode algorithm. In
multiprocessor systems the performance of the semaphore
mechanism determines the degree of parallelism
which can be achieved and thus the system performance.

Each semaphore includes a counter which describes
the quantity of particular resources which are available,
and if no resources are available it may include a list of
processes waiting for a resource to be made available.

Dijkstra defined two operations which act on semaphores:

P(sema): Allocate a resource to the process, blocking
the process if necessary until a resource is available.
While a process is blocked the processor may
be running another process.
V(sema): Return a resource allocated using P to the
pool of resources and, if a process is waiting for the
resource, allocate the resource to that process and
make it runnable.

The P operation decrements the counter and then
decides if it needs to block the process. The V operation
increments the counter and decides if there is a process
waiting to run. Semaphores have been implemented on
existing systems using software to implement the
algorithms. These algorithms are difficult to implement in
software since the semaphore data structure itself is a
resource which needs to be protected.

The normal hardware support commonly used to
provide synchronization of semaphores is an indivisible
memory read/modify/write operation. Software can
build on this by repeatedly doing an indivisible test and
set operation on a lock bit in memory until it finds that
the bit was previously cleared. This is called a spin loop
because contention for the lock bit is resolved by the
processor spinning in a tight loop waiting for the
resource to become available.

If a spin lock is used to protect the semaphore data
structure, the semaphore operations are just the
increment or decrement of the counter and manipulation of
the list of processes waiting for the semaphore.

Semaphores implemented in software can degrade
the performance of a multiprocessor computer system
in several ways: Many semaphore operations per second
can result in a high degree of contention for the
common bus and therefore the semaphore operations
can take a significant amount of time compared to the
code which they protect. The indivisible bus operations
can seriously affect the throughput of the system bus.
The duration of the semaphore operation affects the
contention for the semaphore itself.

SUMMARY OF THE INVENTION

In view of the above described difficulties with the
known prior art, it is the object of the present invention
to provide a computer architecture and method to
relieve the common memory bus in a multiprocessor system
from much of the high degree of data transfer due
to semaphored data structures. To meet this object, the
present invention provides a multiprocessor system
employing a semaphore mechanism, comprising:

a global memory for storing semaphore protected
data structures;

a plurality of CPU boards, each of the CPU boards
containing a central processing unit and local memory
for storing semaphores which are identical to semaphores
stored in the local memory of each of the other
of the CPU boards, each semaphore including a lock bit
which can be set and reset in dependence of whether a
data structure in global memory is being used by a one
of the central processing units;

a common bus interconnecting the global memory
and the plurality of CPU boards;

spin loop means in each central processing unit for
gaining access to a given semaphore stored in local
memory by continuously checking the status of the
local lock bit associated with the given semaphore until
it is determined that the local lock bit is reset, locking
the common bus to prevent other central processing
units from gaining access to the common bus, reading
the local lock bit to ensure it is still reset, and thereafter
setting the associated local lock bit for gaining access to
the given semaphore; and

bus monitoring logic means on each CPU board coupled
to the central processing unit and the local memory
of that CPU board for writing to the common bus
the local set lock bit, and for monitoring the common
bus for a set lock bit written to the common bus by
another bus monitoring logic means on one of the other
CPU boards and writing the set lock bit written onto
the common bus by the other bus monitoring logic
means into local memory to ensure coherency of the
lock bits throughout the multiprocessor system.

The invention thus greatly reduces common bus contention
by allowing the semaphore test bit and set operations
to be performed on each CPU's local bus. The
semaphore lock bits are stored locally in high speed
SRAM on each CPU, and coherency of the lock bits is
maintained through a bus monitoring logic circuit on
each CPU.

A CPU wishing to take possession of a semaphore
performs a local read of its semaphore memory, and
spins locally until the lock bit is reset, at which time it
performs a local write to set the bit.

When the semaphore lock bit is written, it will be
updated locally, and at the same time the write operation
will be sent out over the common bus.

The bus monitoring logic on every other CPU will
recognize the write operation and simultaneously update
the corresponding lock bit in each local semaphore
memory.
This ability to read spin locally relieves the common
bus from the great amount of traffic that occurs in typical
systems that maintain the semaphore lock bits in
common global memory.

The invention will be more specifically described
with relation to a preferred embodiment which is illustrated
in the appended drawings.

DESCRIPTION OF THE DRAWINGS 

FIG. 1 shows a block diagram of a common bus
multiprocessor computer system with semaphore memory
logic circuits on each CPU board.

FIG. 2 shows a block circuit diagram of a CPU board 
containing semaphore memory logic consisting of bus 
monitoring logic for lock bit coherency and high speed 
SRAM for lock bit storage, constructed in accordance 
with an embodiment of the present invention. 

DESCRIPTION OF THE PREFERRED EMBODIMENT

FIG. 1 illustrates the preferred embodiment of the 
invention consisting of a semaphore memory 2, 4, 6 on 
every CPU board 3, 5, 7 in a multiprocessor system. 
FIG. 2 illustrates the organization of the semaphore 
memory which includes bus monitoring logic 21 and 
high speed SRAM 23. 

In order to handle the concurrency condition of two or
more CPUs attempting to gain access to the same semaphore
at the same time, the spin loop is divided into two
parts. A CPU wanting access to a semaphore performs
a local read of the semaphore lock bit. If the bit is set,
the CPU will spin locally, waiting for the bit to be reset.
After the bit is reset, the CPU performs an indivisible
read/modify/write operation by executing a locked test
bit and set instruction.

This involves locking the common bus so that no
other CPU can access it, reading the state of the semaphore
lock bit to make sure it is still reset, and then
setting the bit to gain access to the semaphore. If the bit
was set by another CPU before the locked test and set
could be performed, the CPU will again do a local read
spin. The first CPU to gain locked access to the common
bus will gain access to the semaphore.

This locked test and set operation is performed only
after seeing that the semaphore has become available.
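
The two-part spin loop described above can be sketched in software as follows. This is only an illustrative model, not the hardware of the invention: the class name `SemaphoreLockBit`, its methods, the `locked_attempts` counter, and the use of `threading.Lock` to stand in for locking the common bus are all assumptions made for the example.

```python
import threading
import time


class SemaphoreLockBit:
    """Hypothetical model of one semaphore lock bit (names are illustrative)."""

    def __init__(self):
        self.bit = 0                      # 0 = reset (free), 1 = set (in use)
        self.bus_lock = threading.Lock()  # assumption: stands in for the locked common bus
        self.locked_attempts = 0          # how many locked test-and-set operations ran

    def acquire(self):
        while True:
            # Part 1: spin on plain reads of the lock bit; the bus lock is not taken.
            while self.bit:
                time.sleep(0)  # yield so the other simulated CPUs can run
            # Part 2: locked test bit and set, re-checking that the bit is still reset.
            with self.bus_lock:
                self.locked_attempts += 1
                if self.bit == 0:
                    self.bit = 1
                    return
            # Another CPU set the bit first, so go back to the read spin.

    def release(self):
        # Resetting the bit makes the semaphore available again.
        self.bit = 0
```

In this sketch the expensive locked operation is attempted only after a plain read has seen the bit reset, which mirrors the division of the spin loop into a read spin and a locked test and set.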

The SRAM 23 is used to store the semaphore lock
bits locally and contains semaphores identical to every
other CPU's SRAM. When the CPU 22 wishes to gain
access to a data structure in global memory protected
by a semaphore it must first perform a test bit and set
operation on the particular semaphore lock bit. It does
this by reading the lock bit from the SRAM 23 and
checking its status. If the bit is set, the semaphore protected
data structure is in use by another CPU in the
system. The CPU 22 will then spin in a tight loop until
the bit is reset, indicating the semaphore protected data
structure is available.

When the lock bit is reset, the CPU 22 will set it and
write it back to the SRAM 23. At the same time, the bus
monitoring logic 21 will forward this write from the
local bus 30 to the common bus 40.

The bus monitoring logic continually monitors the 
common bus 40 for writes to semaphore lock bits. 
Whenever a lock bit is written by any CPU, the bus 
monitoring logic 21 will write the same lock bit in its 
local SRAM 23. This ensures coherency of the lock bits 
throughout the system. 
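
The coherency scheme described above can be illustrated with a small software model. This is a sketch under stated assumptions rather than a description of the actual circuit: the classes `CommonBus` and `CpuBoard`, their method names, and the `writes` counter are hypothetical, and the broadcast is modelled as a simple loop instead of simultaneous hardware updates.

```python
class CommonBus:
    """Hypothetical model of the common bus; counts lock-bit writes it carries."""

    def __init__(self):
        self.boards = []
        self.writes = 0

    def attach(self, board):
        self.boards.append(board)

    def broadcast(self, sender, index, value):
        # One bus transaction, observed by the monitoring logic of every other board.
        self.writes += 1
        for board in self.boards:
            if board is not sender:
                board.snoop(index, value)


class CpuBoard:
    """Hypothetical model of a CPU board with a local lock-bit SRAM."""

    def __init__(self, bus, num_semaphores):
        self.sram = [0] * num_semaphores  # local copy of every lock bit
        self.bus = bus
        bus.attach(self)

    def read_lock_bit(self, index):
        # Local read: served from the board's own SRAM, no bus transaction.
        return self.sram[index]

    def write_lock_bit(self, index, value):
        # Local write, forwarded to the common bus at the same time.
        self.sram[index] = value
        self.bus.broadcast(self, index, value)

    def snoop(self, index, value):
        # Bus monitoring logic: mirror a write seen on the common bus.
        self.sram[index] = value
```

In this model, any number of calls to `read_lock_bit` leave `CommonBus.writes` unchanged, while each `write_lock_bit` costs one bus transaction and brings every board's copy of the bit into agreement.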

The bus monitoring logic 21 functions identically to
the bus watching function implemented in common
cache circuits and cache controllers found
throughout the industry and can be implemented similarly
with common cache components.

The SRAM can be implemented with any number of
standard SRAM components with an access speed sufficient
to handle the common bus 40 bandwidth.

The machine instructions for causing the different
read and write operations, the locks and semaphore
operations depend on the type of microprocessor used.
These instructions will be obvious to those skilled in the
art of microprocessor programming. Therefore it is
unnecessary to give a comprehensive list of microcode
instructions. Furthermore it will be obvious to those
skilled in the art that various modifications and changes
may be made therein without departing from the invention.

What is claimed is: 

1. A multiprocessor system employing a semaphore 
mechanism, comprising: 

a global memory for storing semaphore protected
data structures;
a plurality of CPU boards, each one of said CPU
boards containing a central processing unit and
local memory for storing semaphores which are
identical to semaphores stored in the local memory
of each of the other of said CPU boards, each semaphore
including a lock bit which can be set and
reset in dependence of whether a data structure in
global memory is being used by a one of the central
processing units;
a common bus interconnecting said global memory
and said plurality of CPU boards;
spin loop means in each said central processing unit
for gaining access to a given semaphore stored in
local memory by continuously checking the status of
the local lock bit associated with the given semaphore
until it is determined that the local lock bit is
reset, locking the common bus to prevent other
central processing units from gaining access to said
common bus, reading the local lock bit to ensure it
is still reset, and thereafter setting the associated
local lock bit for gaining access to the given semaphore;
and

bus monitoring logic means on each said CPU board
coupled to the central processing unit and the local
memory of that CPU board for writing to said
common bus the local set lock bit, and for monitoring
the common bus for a set lock bit written to
said common bus by another bus monitoring logic
means on one of the other CPU boards and writing
the set lock bit written onto said common bus by
the other bus monitoring logic means into local
memory to ensure coherency of the lock bits
throughout said multiprocessor system.

2. The multiprocessor system according to claim 1, 
wherein each said local memory comprises an SRAM. 

3. The multiprocessor system according to claim 1,
further comprising a local bus on each respective CPU
board interconnecting the central processing unit, local
memory, and bus monitoring logic means on said respective
CPU board, wherein the central processing
unit on each said CPU board writes the local set lock bit
to local memory by way of the local bus and said bus
monitoring logic means monitors said local bus for the
local set lock bit.